Zero-Inflated Models

Count Data
Regression
GLM
Handling count data with an excess of zero counts by combining a count model with a logistic model.

General Principles

Zero-Inflated Regression models are used when the outcome variable is a count variable with an excess of zero counts. These models combine a count model (e.g., Poisson or Negative Binomial) with a separate model for predicting the probability of excess zeros.

Example

Below is an example code snippet demonstrating Bayesian Zero-Inflated Poisson regression using the Bayesian Inference (BI) package. The data represent the production of books in a monastery (y), which is affected by the number of days that individuals work, as well as the number of days individuals drink. This example is based on McElreath (2018).

Code
from BI import bi, jnp

# Setup device ------------------------------------------------
m = bi('cpu', rand_seed = False) # Fixing seed for reproducibility

# Simulated data------------------------------------------------
prob_drink = 0.2  # 20% of days
rate_work = 1     # average 1 manuscript per day

# Sample one year of production
N = 365
drink = m.dist.binomial(1, prob_drink, shape = (N,), sample = True)
y = (1 - drink) * m.dist.poisson(rate_work, shape = (N,), sample = True)

# Setup device------------------------------------------------
m.data_on_model = dict(
    y=jnp.array(y)
)

# Define model ------------------------------------------------
def model(y):
    al = m.dist.normal(1, 0.5, name='al')
    ap = m.dist.normal(-1.5, 1, name='ap')
    p = m.link.inv_logit(ap)
    lambda_ = jnp.exp(al)
    m.dist.zero_inflated_poisson(p, lambda_, obs=y)

# Run MCMC ------------------------------------------------
m.fit(model)

# Summary ------------------------------------------------
m.summary()
jax.local_device_count 16
  0%|          | 0/1000 [00:00<?, ?it/s]warmup:   0%|          | 1/1000 [00:01<17:31,  1.05s/it, 1 steps of size 2.34e+00. acc. prob=0.00]warmup:  12%|β–ˆβ–        | 122/1000 [00:01<00:06, 145.37it/s, 15 steps of size 1.50e-01. acc. prob=0.77]warmup:  25%|β–ˆβ–ˆβ–Œ       | 251/1000 [00:01<00:02, 312.09it/s, 7 steps of size 6.96e+00. acc. prob=0.78] warmup:  39%|β–ˆβ–ˆβ–ˆβ–‰      | 389/1000 [00:01<00:01, 496.05it/s, 3 steps of size 7.82e-01. acc. prob=0.79]sample:  53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž    | 526/1000 [00:01<00:00, 668.01it/s, 3 steps of size 6.76e-01. acc. prob=0.91]sample:  67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹   | 667/1000 [00:01<00:00, 830.28it/s, 5 steps of size 6.76e-01. acc. prob=0.90]sample:  81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ  | 809/1000 [00:01<00:00, 969.46it/s, 7 steps of size 6.76e-01. acc. prob=0.89]sample:  95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ| 951/1000 [00:01<00:00, 1082.52it/s, 7 steps of size 6.76e-01. acc. prob=0.90]sample: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1000/1000 [00:01<00:00, 558.63it/s, 3 steps of size 6.76e-01. acc. prob=0.89]
arviz - WARNING - Shape validation failed: input_shape: (1, 500), minimum_shape: (chains=2, draws=4)
mean sd hdi_5.5% hdi_94.5% mcse_mean mcse_sd ess_bulk ess_tail r_hat
al 0.15 0.09 0.03 0.30 0.01 0.00 185.03 264.67 NaN
ap -0.74 0.23 -1.08 -0.34 0.02 0.01 179.61 283.56 NaN
library(BayesianInference)

# setup platform------------------------------------------------
m=importBI(platform='cpu')

# Simulate data ------------------------------------------------
prob_drink = 0.2  # 20% of days
rate_work = 1     # average 1 manuscript per day
# sample one year of production
N = as.integer(365)
drink = bi.dist.binomial(total_count = as.integer(1), probs = prob_drink, shape = c(N), sample = T ) # An example of sampling a distribution with BI
y = (1 - drink) *  bi.dist.poisson(rate_work, shape = c(N), sample = T)
data = list()
data$y = y
m$data_on_model = data

# Define model ------------------------------------------------
model <- function(y){
  al = bi.dist.normal(1, 0.5, name='al', shape=c(1))
  ap = bi.dist.normal(-1, 1, name='ap', shape=c(1))
  p = m$link$inv_logit(ap)
  lambda_ = jnp$exp(al)
  bi.dist.zero_inflated_poisson(p, lambda_, obs=y)
}

# Run MCMC ------------------------------------------------
m$fit(model) # Optimize model parameters through MCMC sampling

# Summary ------------------------------------------------
m$summary() # Get posterior distribution
using BayesianInference

# Setup device------------------------------------------------
m = importBI(platform="cpu")

# Simulated data ------------------------------------------------
prob_drink = 0.2  
rate_work = 1     

# Sample one year of production
N = 365

# Note: Use lowercase 'true' for booleans in Julia
# 'drink' will be a Python/JAX object
drink = m.dist.binomial(1, prob_drink, shape=(N,), sample=true)

# Math works automatically because 'drink' is a Py object 
# and we taught Julia how to handle Py arithmetic in the previous step
y = (1 - drink) * m.dist.poisson(rate_work, shape=(N,), sample=true)

# Send data to BI class object ------------------------------------------------
m.data_on_model = pydict(Dict("y" =>y))


# Define model ------------------------------------------------
@BI function model(y)
    al = m.dist.normal(1, 0.5, name="al")
    ap = m.dist.normal(-1.5, 1, name="ap")
    p = m.link.inv_logit(ap)
    lambda_ = jnp.exp(al)
    m.dist.zero_inflated_poisson(p, lambda_, obs=y)
end

# Run mcmc ------------------------------------------------
m.fit(model)  # Optimize model parameters through MCMC sampling

# Summary ------------------------------------------------
m.summary() # Get posterior distributions

Mathematical Details

We model the relationship between the independent variable X and the count outcome variable Y. The model assumes that the data-generating process has two states: a β€œzero” state that generates only zero counts, and a β€œPoisson” state that generates counts from a Poisson distribution.

The overall model has two main linear models (\pi_i and \lambda_i):

Y_i \sim \text{ZIPoisson}(\pi_i, \lambda_i)

Where the mass probability mass function \text{ZIPoisson}(\pi_i, \lambda_i) is given by:

P(Y_i | \pi_i, \lambda_i) = \begin{cases} \pi_i + (1 - \pi_i) \times \text{Poisson}(0 | \lambda_i) & \text{if } Y_i = 0 \\ (1 - \pi_i) \times \text{Poisson}(Y_i | \lambda_i) & \text{if } Y_i > 0 \end{cases}

  1. \pi_i A logistic regression model to predict the probability of a structural zero.

  2. \lambda_i is a Poisson regression conditional on the outcome not resulting from a structural zero.

Jointly these two parameters allow us to estimate the probability of getting a zero (Y_i = 0) as the sum of two processes: the probability of a structural zero (\pi_i) plus the probability of not being a structural zero (1 - \pi_i) and then getting a zero from the Poisson distribution anyway.

The parameters \pi_i and \lambda_i can be modeled as functions of an independent variable X_i:

\text{logit}(\pi_i) = \alpha_\pi + \beta_\pi X_i

\log(\lambda_i) = \alpha_ \lambda + \beta_\lambda X_i

We can assign prior distributions to the model parameters with weakly informative Normal priors:

\alpha_\pi \sim \text{Normal}(0, 1)

\beta_\pi \sim \text{Normal}(0, 1)

\alpha_\lambda \sim \text{Normal}(0, 1)

\beta_\lambda \sim \text{Normal}(0, 1)

Reference(s)

McElreath, Richard. 2018. Statistical Rethinking: A Bayesian course with examples in R and Stan. Chapman; Hall/CRC.